United States Patent and Trademark Office



This is in response to the appeal brief filed 11/14/05 appealing from the Office action
mailed 7/25/05. 
Art Unit: 2655

(1) Real Party in Interest 

A statement identifying by name the real party in interest is contained in the brief. 

(2) Related Appeals and Interferences 

The examiner is not aware of any related appeals, interferences, or judicial 
proceedings which will directly affect or be directly affected by or have a bearing on the 
Board's decision in the pending appeal. 

(3) Status of Claims 

The statement of the status of claims contained in the brief is correct. 

(4) Status of Amendments After Final 

The amendment after final rejection filed on 9/15/05 has been entered. 

(5) Summary of Claimed Subject Matter 

The summary of claimed subject matter contained in the brief is correct. 

(6) Grounds of Rejection to be Reviewed on Appeal 

The appellant's statement of the grounds of rejection to be reviewed on appeal is 
correct. 



(7) Claims Appendix 

The copy of the appealed claims contained in the Appendix to the brief is correct.

(8) Evidence Relied Upon 

5,418,717 Su et al. 5-1995

4,868,750 Kucera et al. 9-1989

(9) Grounds of Rejection 

The following ground(s) of rejection are applicable to the appealed claims: 
Claim Rejections - 35 USC § 102 

1. The following is a quotation of the appropriate paragraphs of 35 U.S.C. 102 that
form the basis for the rejections under this section made in this Office action: 

A person shall be entitled to a patent unless - 

(b) the invention was patented or described in a printed publication in this or a foreign country or in public 
use or on sale in this country, more than one year prior to the date of application for patent in the United 
States. 

2. Claims 1-3 and 6-8 are rejected under 35 U.S.C. 102(b) as being anticipated by
Su et al. (US Patent 5,418,717).

3. Regarding claim 1, Su et al. disclose a method of generating a score for a node
identified during a parse of a text segment, the method comprising:

identifying a phrase level for the node (decomposing sentence into phrase levels,
Col. 13, Lines 22-33);

identifying a word class for at least one word that neighbors a text spanned by
the node (analyzing input text for part-of-speech, Col. 9, Lines 32-40; examining words
to the left and right of the current word, Col. 9, Lines 44-49); and

generating a score by determining a mutual information metric based on the
phrase level and the word class (composite scores, Col. 5, Lines 8-14; syntactic score,
Col. 9, Lines 32-40 and Col. 13, Lines 34-37; a conditional probability may be obtained
for a syntactic score of a symbol based on its right and left contexts, Col. 13, Lines
40-44).

4. Regarding claim 2, Su et al. disclose identifying a word class for a word to the left 
of the text spanned by the node and identifying a word class for a word to the right of 
the text spanned by the node (examining words to the left and right of the current word,
Col. 9, Lines 44-49).

5. Regarding claim 3, Su et al. disclose generating a score based on the phrase 
level of the node, the word class of the word to the right of the text spanned by the node 
and the word class of the word to the left of the text spanned by the node (composite
scores, Col. 5, Lines 8-14; syntactic score, Col. 9, Lines 32-40 and Col. 13, Lines 34-37;
examining words to the left and right of the current word, Col. 9, Lines 44-49). 

6. Regarding claims 6 and 8, Su et al. further disclose identifying all possible word
classes for at least one word, for a word to the left of the text spanned by the node and
for a word to the right of the text spanned by the node (part-of-speech, categories of
prior words, Col. 11, Lines 44-50; examining words to the left and right of the current
word, Col. 9, Lines 44-49; examining context information near the current word, Col. 11,
Lines 52-54).

7. Regarding claim 7, Su et al. disclose generating a score based in part on all of 
the identified word classes (lexical score and probability of a word having a category or
part-of-speech, Col. 11, Lines 46-50).



Claim Rejections - 35 USC § 103 



8. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

9. Claims 5, 10, 12-19 and 21-22 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Su et al. (US Patent 5,418,717) in view of Kucera et al. (US Patent
4,868,750).

10. Regarding claims 10 and 19, Su et al. further disclose a parser and computer-
readable medium for generating a syntax structure from a text segment, comprising:

a seeding unit for inserting words from the text segment into a candidate list as
nodes (parsing using grammatical relevancy, Col. 2, Lines 13-21; storing candidates
most likely to be correct, Col. 6, Lines 6-9 and 11-17);

a node selector for promoting nodes from the candidate list to a node chart
(storing candidates most likely to be correct, Col. 6, Lines 6-9 and 11-17);

a rule engine for combining nodes in the node chart to form a larger node
(parsing using grammatical relevancy, Col. 2, Lines 13-21; score function engine, Col.
10, Lines 48-53, Fig. 3A, element 311); and

a metric calculator for generating a score for a node formed by the rule engine,
the score being based in part on mutual information determined based on a phrase level
of the node formed by the rule engine and at least one word in the text segment (scores
determined at node positions, Col. 4, Lines 66-68; Col. 13, Lines 22-44; Fig. 4; a
conditional probability may be obtained for a syntactic score of a symbol based on its
right and left contexts, Col. 13, Lines 40-44); and

using the score for the syntax node when forming the full parse structure
(generating and truncating syntax trees on the basis of node scores, Col. 4, Lines 61-
68).

However, Su et al. do not disclose but Kucera et al. suggest the score being
based in part on mutual information (collocational probability, Col. 2, Lines 28-34).

Therefore it would have been obvious to one ordinarily skilled in the art at the time of the
invention to supplement the teachings of Su et al. with generating a score by 
determining a mutual information metric, as suggested by Kucera et al., in order to have 
the adjacent words and context regarding the node affect the scoring process and 
produce a more accurate and complete node score. 

11. Regarding claim 5, Su et al. do not disclose but Kucera et al. do disclose that
determining a mutual information metric comprises determining a mutual information
metric based on the phrase level of the node, the word class of the word to the right of
the text spanned by the node and the word class of the word to the left of the text
spanned by the node (phrase identification, Col. 3, Lines 3-11; ranking per phrase
boundaries, Col. 25, Lines 27-33; adjacent tags, Col. 1, Lines 51-55; Col. 1, Line 65 -
Col. 2, Line 3; major class headers for tags, Fig. 4).

Therefore it would have been obvious to one ordinarily skilled in the art at the 
time of the invention to supplement the teachings of Su et al., with determining a mutual 
information metric based on the phrase level of the node, the word class of the word to 
the right of the text spanned by the node and the word class of the word to the left of the 
text spanned by the node, as taught by Kucera et al., in order to include relevant context
information in the node score metric and thus ascertain a more accurate and complete 
node score. 

12. Regarding claim 12, Su et al. do not disclose but Kucera et al. do disclose that the
mutual information is determined based on a word class for a word in a text segment
(determining probable tags in order of likelihood, Col. 1, Line 65 - Col. 2, Line 3; major
class headers for tags, Fig. 4).

Therefore it would have been obvious to one ordinarily skilled in the art at the 
time of the invention to supplement the teachings of Su et al. with determining the 
mutual information based on a word class for a word in a text segment, as disclosed by 
Kucera et al., in order to account for relevant grammatical information in the scoring
procedure, thus resulting in a more accurate and informed node score. 

13. Regarding claim 21, Su et al. disclose the mutual information score is further
based on all possible word classes of a word in the text segment (a conditional 
probability may be obtained for a syntactic score of a symbol based on its right and left 
contexts, Col. 13, Lines 40-44; Col. 17, Lines 47-66, Figure 7; Col. 11, Lines 46-50). 

14. Regarding claim 13, Su et al. do not disclose but Kucera et al. do disclose the
mutual information is determined based on all possible word classes for a word in the
text segment (annotating each word with possible tags, Col. 1, Line 65 - Col. 2, Line 3).

Therefore it would have been obvious to one ordinarily skilled in the art to
supplement the teachings of Su et al. with having the mutual information determined
based on all possible word classes for a word in a text segment, as taught by Kucera et
al., in order to achieve a better parsing result by considering all the reasonable word
class possibilities.

15. Regarding claims 14-15 and 22, Su et al. do not disclose but Kucera et al. do 
disclose the mutual information is determined based on a word class for a word to the 
left/right of a set of words spanned by the node formed by the rule engine (adjacent
tags, Col. 1, Lines 51-55; Col. 1, Line 65 - Col. 2, Line 3; major class headers for tags, 
Fig. 4). 

Therefore it would have been obvious to one ordinarily skilled in the art at the 
time of the invention to supplement the teachings of Su et al., with determining a mutual 
information metric based on a word class for a word to the left/right of a set of words 
spanned by the node formed by the rule engine, as taught by Kucera et al., in order to 
include relevant context information in the node score metric and thus ascertain a more 
accurate and complete node score. 

16. Regarding claim 16, Su et al. further disclose a lexicon look-up for determining 
parts of speech for words in the text segment (Col. 5, Lines 65-68). 

17. Regarding claim 17, Su et al. do not explicitly disclose but Kucera et al. do
disclose the seeding unit inserts a node for each part of speech of each word in the text
segment (annotating each word with possible tags, Col. 1, Line 65 - Col. 2, Line 3;
nodes, Col. 2, Lines 43-46; inserting tags in node structures, Col. 26, Lines 25-28; Fig.
13, step 180).

18. Regarding claim 18, Su et al. further disclose the seeding unit inserts nodes
representing the beginning of the text segment and the ending of the text segment
(terminal nodes, Col. 13, Lines 26-31).

(10) Response to Argument 

1. Regarding claims 1-3, 6, and 8, Applicant argues that Su does not suggest
employing a mutual information metric as part of the parsing process because Su
never refers to the disclosed Syntactic Score (see Su, Col. 13, Lines 6-61) as mutual
information. The Examiner agrees that Su does not denominate the Syntactic Score
disclosed as mutual information. However, this does not prevent the Syntactic Score
disclosed by Su from suggesting a mutual information metric for a node based on
a phrase level and a word class, as disclosed by Applicant. It is well known in the art
that the mutual information I between two events X and Y contains terms of conditional
probabilities and is calculated as follows:

$$I(X;Y) = \sum_{x,y} p(y)\, p(x \mid y)\, \log \frac{p(x \mid y)}{p(x)}$$

(conditional probabilities $p(x \mid y)$ emphasized). This expression for mutual information is
commonly simplified and further expressed as the following equivalent expression (see
similar expression in Applicant's Arguments, Page 6):

$$I(X;Y) = \sum_{x,y} p(x,y)\, \log \frac{p(x,y)}{p(x)\, p(y)}$$
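
For illustration only, the equivalence of the two expressions above can be checked numerically. The following Python sketch assumes a small hypothetical joint distribution over two binary events; the values and the function names are illustrative assumptions and are not drawn from Su, Kucera, or Applicant's disclosure.

```python
import math

# Hypothetical joint distribution p(x, y); values are illustrative only.
joint = {
    ("x0", "y0"): 0.4,
    ("x0", "y1"): 0.1,
    ("x1", "y0"): 0.2,
    ("x1", "y1"): 0.3,
}


def marginals(joint):
    """Return the marginal distributions p(x) and p(y) of a joint table."""
    px, py = {}, {}
    for (x, y), p in joint.items():
        px[x] = px.get(x, 0.0) + p
        py[y] = py.get(y, 0.0) + p
    return px, py


def mi_conditional_form(joint):
    """Sum over x, y of p(y) * p(x|y) * log(p(x|y) / p(x))."""
    px, py = marginals(joint)
    total = 0.0
    for (x, y), pxy in joint.items():
        if pxy == 0.0:
            continue  # zero-probability cells contribute nothing
        p_x_given_y = pxy / py[y]
        total += py[y] * p_x_given_y * math.log(p_x_given_y / px[x])
    return total


def mi_joint_form(joint):
    """Sum over x, y of p(x,y) * log(p(x,y) / (p(x) * p(y)))."""
    px, py = marginals(joint)
    return sum(
        p * math.log(p / (px[x] * py[y]))
        for (x, y), p in joint.items()
        if p > 0.0
    )


# Both forms should agree up to floating-point rounding.
difference = abs(mi_conditional_form(joint) - mi_joint_form(joint))
```

The two functions differ only in whether the joint probability is written as p(y) p(x|y) or as p(x,y), which is the substitution that takes the first expression to the second.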

Thus, mutual information, by its very nature, includes a conditional probability 
measurement of the events in question. This is, de facto, evidenced by Applicant's 
disclosure in Page 20, Lines 1-21 (see Eq. 5), wherein Applicant discloses that for the 
case in which all the possible classes for the words left and right of the node in question 
are taken into consideration, the mutual information metric is expressed as: 

$$I(\mathrm{node}) = \sum_{\text{left classes}} \sum_{\text{right classes}} P(wc_l \mid word_l)\, P(wc_r \mid word_r)\, I(wc_l, PL_{node}, wc_r)$$

Note conditional probability terms $P(wc_l \mid word_l)$ and $P(wc_r \mid word_r)$. [These additional
elements to be factored into the mutual information metric are particular to the
embodiments of claims 6 and 7, which depend on and further limit claim 1.] Su's
Syntactic Score, like Applicant's mutual information score, is not a mere conditional
probability measure but a metric partly composed of conditional probabilities; see
Col. 13, Lines 34-40, wherein Su discloses that the syntactic score may be calculated as the
product of probabilities and that these probabilities may be simplified into conditional
probabilities. Su's Syntactic Score suggests Applicant's Mutual Information Score
because a statistical measurement is being made in order to determine a likelihood of
occurrence (a scoring based on compliance of a node in question with respect to
grammatical rules) for a node (each higher node or phrase level being a sequence of
tagged or classified words added upon as the parsing progresses) given the context
(i.e., words right and left of the node and their respective possible classifications) and
the phrase level of the node. That statistical measurement is therefore necessarily
based on conditional probabilities and, in both cases, has the same objective.
[Conditional probability is the probability of some event A, given that some other
event, B, has already occurred.] Thus, in order to obtain a combined likelihood of the
joint occurrence of all the elements/events being parsed, conditional probabilities are
calculated with respect to all the possibilities, and compounded to form a metric that
yields a rank for each possibility. Furthermore, compare the terms li (left word), ri
(right word), Ci (lexical category of word 1) and Ls, Lj, etc. (phrase level of the node) in
Col. 13, Lines 45-60 of Su's disclosure to the terms wc_l (word class of left word), wc_r
(word class of right word), word_l, and PL_node (phrase level of the node) of Applicant's
disclosure in Page 20, Eqs. 3, 4 and 5. In both cases, in order to account for multiple
class possibilities for each of the chosen number of words left and right of the node, as
well as for previously occurring phrase levels, a compounded statistical metric results
that is neither a simple mutual information metric nor a conditional probability. Moreover,
while the specific mathematical definitions of Applicant's mutual information
metrics/scores are not being claimed, the broader mutual information per se is a
well-known statistical tool, which is commonly used in corpus linguistics.
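
For illustration only, the structure of the expression cited from Applicant's Eq. 5 (a sum over candidate left and right word classes, each term weighted by the conditional probability of the class given the word) can be sketched in Python as follows. The class labels, the probabilities, the mutual information table and the name `node_score` are hypothetical assumptions made for this sketch; treating a missing table entry as 0.0 is likewise an assumption and is not taken from either disclosure.

```python
def node_score(left_classes, right_classes, phrase_level, mi_table):
    """Weighted sum over all left/right class pairs.

    left_classes and right_classes map a word class to P(class | word).
    mi_table maps (left class, phrase level, right class) to a
    precomputed mutual information value (hypothetical numbers here).
    """
    total = 0.0
    for wc_l, p_l in left_classes.items():
        for wc_r, p_r in right_classes.items():
            # Assumption: combinations absent from the table contribute 0.0.
            mi = mi_table.get((wc_l, phrase_level, wc_r), 0.0)
            total += p_l * p_r * mi
    return total


# Hypothetical class distributions for the words left and right of a node.
left = {"DET": 0.9, "PRON": 0.1}
right = {"VERB": 0.7, "NOUN": 0.3}

# Hypothetical mutual information values for a noun-phrase node.
mi_table = {
    ("DET", "NP", "VERB"): 1.2,
    ("DET", "NP", "NOUN"): 0.4,
    ("PRON", "NP", "VERB"): 0.2,
    ("PRON", "NP", "NOUN"): 0.1,
}

score = node_score(left, right, "NP", mi_table)
```

Every candidate class of both neighboring words contributes to the result in proportion to its probability, which is the sense in which the compounded metric accounts for multiple class possibilities.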

2. Regarding claims 5, 10, 12 and 14-19, Applicant argues that neither Kucera nor
Su suggests generating a mutual information metric based on a phrase level, since
collocational probability does not suggest mutual information. The Examiner cannot
concur with the Applicant. Kucera suggests generating a mutual information metric
based on the phrase level (see Kucera, collocational probability, Col. 2, Lines 28-34).
Collocation information and mutual information are well-known in the art as statistical
alternatives that fulfill a part-of-speech disambiguation purpose; therefore, the use of a
collocation probability can suggest the use of a mutual information metric in the
corresponding art.

3. Regarding claim 7, Applicant argues that neither Su nor Kucera disclose 
'generating a score based [in part] on all of the identified word classes', and that instead 
Su discloses generating separate scores, which are never combined. The Examiner 
cannot concur with the Applicant. Su discloses that the lexical score is a component of 
a higher-level score function (see Su, Col. 11, Lines 42-44, and lexical score and 
probability of a word having a category or part-of-speech, Col. 11, Lines 46-50; Col. 11, 
Lines 6-15), and discloses that all possible parts-of-speech for a word are considered 
(see weighted lexical category, finding additional lexical categories, and retaining the 
highest ranked sequence, Su, Col. 18, Lines 1-49). The lexical score disclosed by Su 
generates a score that is based in part on all of the identified word classes because,
firstly, all the possible lexical categories for a word are accounted for, and, secondly,
when obtaining the final score for a word sequence, the lexical categories for each word
in the sequence have been weighted, so that the final score reflects a level of certainty
of the selected tagging/parsing. That is, if the word 'rose' has three possible categories,
noun, verb or adjective, with probabilities of 0.60, 0.30 and 0.10 respectively, given the
context (or master text), the chosen category will be the one with the highest probability,
i.e. noun, and because it is a weighted lexical score the overall score for the node in
question will therefore reflect the fact that there is only a 60% probability of noun being
the proper classification for 'rose' given the context. Therefore, the score generated by
Su is a score that is based in part on all possible word classes identified for a word.
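
For illustration only, the arithmetic of the three-category example above can be sketched in Python. The function name `best_category` is a hypothetical assumption; the sketch shows only the selection of the highest-probability category and the weight it carries, and is not presented as Su's actual scoring procedure.

```python
# Category probabilities from the example above, given the context.
category_probs = {"noun": 0.60, "verb": 0.30, "adjective": 0.10}


def best_category(probs):
    """Return the highest-probability category and its probability."""
    category = max(probs, key=probs.get)
    return category, probs[category]


chosen, weight = best_category(category_probs)
# The weight carried forward is the probability of the chosen category,
# so the remaining probability mass belongs to the competing categories.
remaining = sum(p for c, p in category_probs.items() if c != chosen)
```

Because the probabilities of all candidate categories sum to one, the weight of the chosen category is fixed by the probabilities assigned to the others.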

4. Regarding claims 13 and 21-22, Applicant argues that neither Su nor Kucera
discloses that mutual information is determined based on all possible word classes for a word
in the text segment. The Examiner cannot concur with the Applicant. Both Su and
Kucera disclose said limitation [For Su's disclosure see explanation above regarding
Claim 7]. Kucera discloses annotating each word with a sequence of possible
grammatical tags (see Kucera, Col. 1, Line 65) in similar fashion to Su. Furthermore, the
tags are ranked in order of likelihood, and all classes/tags for a word are accounted for
in the likelihood ranking because for a word with two possible classifications a
preference for one classification will result in proportionate rejection of another.
Therefore all the possible identified classes for a word are used in generating the final
likelihood or score.

For the above reasons, it is believed that the rejections should be sustained. 

(11) Related Proceeding(s) Appendix

No decision rendered by a court or the Board is identified by the examiner in the 
Related Appeals and Interferences section of this examiner's answer. 